Journal: bioRxiv
Article Title: A strategy for genome-wide seamless tagging of human protein-coding genes
doi: 10.1101/2025.03.04.641506
Figure Legend Snippet: a, Visualization of mixed pools of cells in which endogenous proteins are fused to mClover3 using our CRISPR/Cas9-NHEJ-based strategy. Cells vary in fluorescence intensity and expression pattern, suggesting that many proteins with varying abundances and different subcellular localizations were tagged. Scale bar, 50 µm. b, NGS results indicate that ∼80% of human genes were tagged 5’ or 3’ to protein-coding sequences in the genome of HEK293T cells. c, Distribution of abundance for all proteins expressed in HEK293T cells versus successfully or unsuccessfully detected targets; boxes represent the 25th, 50th, and 75th percentiles, and whiskers represent 1.5 times the interquartile range. The median is indicated by a white line. Outliers are not shown. The p-value is 5.5 × 10^-28 (Student’s t-test). The data indicate that low protein abundance posed a significant challenge to successful tagging, as the proteins that were not successfully tagged displayed low or no expression in HEK293T cells. d, NGS results indicate that 89.7% of the essential genes were tagged 5’ or 3’ to protein-coding sequences in the genome. e, Distribution of abundance for essential proteins expressed in HEK293T cells versus successfully or unsuccessfully tagged genes; boxes represent the 25th, 50th, and 75th percentiles, and whiskers represent 1.5 times the interquartile range. The median is indicated by a white line. The p-value is 1.02 × 10^-2 (Student’s t-test). The data indicate that low protein abundance posed a challenge to successful tagging of essential genes, as the proteins that were not successfully tagged displayed low expression in HEK293T cells. f, Comparison of annotated protein localizations between the PRISM and Human Protein Atlas (HPA) datasets (37).
The diagram features colored bands that correspond to groups of proteins with similar localization annotations in our dataset and in HPA. The width of each band is proportional to the number of proteins in the group. g, Imaging analysis of individual successful targets. mClover3 fluorescence intensity and subcellular localization vary widely from gene to gene.
Article Snippet: The plasmid encoding Cas9-NG (34) was purchased from Addgene (PX330-SpCAS9-NG).
Techniques: CRISPR, Fluorescence, Expressing, Comparison, Imaging